A Common Access Structure for Standard Attributes and Document Representations in Vector Space
Abstract
In next-generation information systems there will be a coalescence of (object-oriented) database management systems and information retrieval systems. In particular, the integration of content-based retrieval techniques into object-oriented database management systems is an interesting requirement in this respect. Whereas the aspects of this coalescence dealing with the query language and the data model have been addressed in some recent papers, approaches dealing with the integration at the physical level are missing. In the present paper we propose a common access structure which can support a content-based similarity search with additional conditions on standard attributes in one homogeneous step. To this end, we use a k-d-tree based multi-attribute access structure which considers standard attributes in the first dimensions and the components of a document description vector in the higher dimensions. We describe the algorithms for this access structure and present first performance results.
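The idea of placing standard attributes in the first dimensions and document vector components in the higher dimensions can be illustrated with a small in-memory sketch. This is not the paper's access structure or its algorithms, which the abstract does not spell out: it is a plain k-d tree, the names `Node`, `build` and `search` are hypothetical, and squared Euclidean distance over the document components is assumed as the similarity measure. The search handles range conditions on the standard attributes and the k-nearest-neighbour condition on the document vector in a single traversal.

```python
import heapq
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    point: tuple                    # standard attributes first, then document vector components
    axis: int                       # dimension this node splits on
    left: Optional["Node"] = None   # points with value <= split value on `axis`
    right: Optional["Node"] = None  # points with value >= split value on `axis`


def build(points, depth=0):
    """Build a k-d tree by cycling through all dimensions with median splits."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))


def search(root, n_attrs, ranges, query, k):
    """Return up to k points whose first n_attrs values lie in `ranges`
    (one (lo, hi) pair per standard attribute), ordered by increasing
    distance of their document components to `query`."""
    best = []  # max-heap emulated with negated squared distances

    def visit(node):
        if node is None:
            return
        p, axis = node.point, node.axis
        if all(lo <= p[i] <= hi for i, (lo, hi) in enumerate(ranges)):
            d2 = sum((a - b) ** 2 for a, b in zip(p[n_attrs:], query))
            if len(best) < k:
                heapq.heappush(best, (-d2, p))
            elif d2 < -best[0][0]:
                heapq.heapreplace(best, (-d2, p))
        if axis < n_attrs:
            # Standard-attribute dimension: prune subtrees outside the range.
            lo, hi = ranges[axis]
            if lo <= p[axis]:
                visit(node.left)
            if hi >= p[axis]:
                visit(node.right)
        else:
            # Document dimension: visit the nearer side first, then the farther
            # side only if it could still hold a closer point.
            diff = query[axis - n_attrs] - p[axis]
            near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
            visit(near)
            if len(best) < k or diff * diff < -best[0][0]:
                visit(far)

    visit(root)
    return [p for _, p in sorted(best, key=lambda t: -t[0])]


# Illustrative data: (publication year, two document vector components).
docs = [
    (1995, 0.9, 0.1),
    (1996, 0.8, 0.3),
    (1997, 0.1, 0.9),
    (1997, 0.7, 0.1),
    (1998, 0.85, 0.15),
]
tree = build(docs)
# Two most similar documents to (0.9, 0.1) published between 1996 and 1997.
hits = search(tree, n_attrs=1, ranges=[(1996, 1997)], query=(0.9, 0.1), k=2)
print(hits)
```

Because the attribute dimensions and the vector dimensions share one tree, a single descent can discard subtrees on either kind of condition, which is the "one homogeneous step" the abstract refers to. A disk-based structure with many vector dimensions would need the design choices described in the paper itself.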
Similar resources
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
Space Vector Pulse Width Modulation with Reduced Common Mode Voltage and Current Losses for Six-Phase Induction Motor Drive with Three-Level Inverter
Common-mode voltage (CMV) generated by the inverter causes motor bearing failures in multiphase drives. On the other hand, the presence of undesired z-component currents in a six-phase induction machine (SPIM) leads to extra current losses and has to be considered in pulse width modulation (PWM) techniques. In this paper, it is shown that the presence of z-component currents and CMV in six phase driv...
Space Vector Modulation Technique to Reduce Leakage Current of a Transformerless Three-Phase Four-Leg Photovoltaic System
Photovoltaic systems integrated into the grid have received considerable attention around the world. They can be connected to the electrical grid via galvanic isolation (transformer) or without it (transformerless). Although it provides galvanic isolation, a low-frequency transformer increases size, cost and losses. On the other hand, transformerless PV systems increase the leakage current (common-mode c...
Publication date: 1997